dbWFA: a web-based database for functional annotation of Triticum aestivum transcripts
Abstract
The functional annotation of genes based on sequence homology with genes from model species genomes is time-consuming because it requires mining several unrelated databases. The aim of the present work was to develop a functional annotation database for common wheat, Triticum aestivum (L.). The database, named dbWFA, is based on the reference NCBI UniGene set, an expressed gene catalogue built by expressed sequence tag clustering, and on full-length coding sequences retrieved from the TriFLDB database. Information from good-quality heterogeneous sources, including annotations for the model plant species Arabidopsis thaliana (L.) Heynh. and Oryza sativa L., was gathered and linked to T. aestivum sequences through BLAST-based homology searches. Even though the complexity of the transcriptome cannot yet be fully appreciated, we developed a tool to easily and promptly obtain information from multiple functional annotation systems (Gene Ontology, MapMan bin codes, MIPS Functional Categories, PlantCyc pathway reactions and TAIR gene families). The use of dbWFA is illustrated here with several query examples. We were able to assign a putative function to 45% of the UniGenes and 81% of the full-length coding sequences from TriFLDB. Moreover, comparing the annotation of the whole T. aestivum UniGene set with curated annotations of the two model species allowed the accuracy of the annotation provided by dbWFA to be assessed. To further illustrate the use of dbWFA, genes specifically expressed during the early cell division or late storage polymer accumulation phases of T. aestivum grain development were identified using a clustering analysis and then annotated using dbWFA. The annotation of these two sets of genes was consistent with previous analyses of T. aestivum grain transcriptomes and proteomes.
Database URL: urgi.versailles.inra.fr/dbWFA/
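The abstract describes linking T. aestivum sequences to model-species annotations through BLAST-based homology searches. The sketch below illustrates that general idea and is not the dbWFA pipeline itself. It assumes BLAST results in the standard 12-column tabular format, keeps the highest-scoring hit per query below an E-value cutoff, and copies the annotations of the matched model-species gene. The function names, the 1e-5 cutoff, the sequence identifiers and the annotation labels are all hypothetical and chosen only for illustration.

```python
import csv
import io

# Made-up BLAST tabular rows (12 standard columns: qseqid, sseqid, pident,
# length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore).
SAMPLE_BLAST = (
    "Ta.0001\tmodelA\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t180.0\n"
    "Ta.0001\tmodelB\t70.0\t150\t45\t0\t1\t150\t1\t150\t1e-20\t90.0\n"
    "Ta.0002\tmodelC\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t25.0\n"
)

# Made-up annotations attached to model-species genes.
SAMPLE_ANNOTATIONS = {
    "modelA": ["GO:example_term_1", "MapMan:example_bin"],
    "modelB": ["GO:example_term_2"],
}


def best_hits(blast_tsv, max_evalue=1e-5):
    """Return {query: (subject, evalue, bitscore)} for the best hit per query.

    The cutoff is an assumed value, not the threshold used by dbWFA.
    """
    best = {}
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue  # hit too weak to support annotation transfer
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


def transfer_annotations(hits, subject_annotations):
    """Copy the annotations of each best-hit subject onto its query."""
    return {
        query: subject_annotations.get(subject, [])
        for query, (subject, _evalue, _bitscore) in hits.items()
    }


hits = best_hits(io.StringIO(SAMPLE_BLAST))
annotated = transfer_annotations(hits, SAMPLE_ANNOTATIONS)
```

In this toy input, the second query has no hit below the cutoff and therefore receives no putative function, which mirrors why only a fraction of sequences can be annotated by homology.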
Related articles
The wheat (Triticum aestivum L.) leaf proteome.
The wheat leaf proteome was mapped and partially characterized to function as a comparative template for future wheat research. In total, 404 proteins were visualized, and 277 of these were selected for analysis based on reproducibility and relative quantity. Using a combination of protein and expressed sequence tag database searching, 142 proteins were putatively identified with an identificat...
Phenotyping, association analysis and annotation of genes related to leaf wilting of bread wheat (Triticum aestivum L.) at the seedling stage under drought stress conditions
Rapid screening of plant germplasm in the early stages of growth and determining the genetic basis of the wheat leaf wilting index at the seedling stage are necessary for wheat breeding programs. In the present research, the leaf wilting index of 290 Iranian bread wheat genotypes, including 90 cultivars and 200 landraces, was studied under drought stress conditions at the seedling stage in 2021 in res...
cDNA Library Enrichment of Full Length Transcripts for SMRT Long Read Sequencing
The utility of genome assemblies does not only rely on the quality of the assembled genome sequence, but also on the quality of the gene annotations. The Pacific Biosciences Iso-Seq technology is a powerful support for accurate eukaryotic gene model annotation as it allows for direct readout of full-length cDNA sequences without the need for noisy short read-based transcript assembly. We propos...
Sorting the Wheat from the Chaff: Identifying miRNAs in Genomic Survey Sequences of Triticum aestivum Chromosome 1AL
Individual chromosome-based studies of bread wheat are beginning to provide valuable structural and functional information about one of the world's most important crops. As new genome sequences become available, identifying miRNA coding sequences is arguably as important a task as annotating protein coding sequences, but one that is not as well developed. We compared conservation-based identifi...
Comprehensive Functional Analyses of Expressed Sequence Tags in Common Wheat (Triticum aestivum)
About 1 million expressed sequence tag (EST) sequences comprising 125.3 Mb nucleotides were accreted from 51 cDNA libraries constructed from a variety of tissues and organs under a range of conditions, including abiotic stresses and pathogen challenges in common wheat (Triticum aestivum). Expressed sequence tags were assembled with stringent parameters after processing with inbuild scripts, res...
Volume: 2013
Publication date: 2013